We consider the problem of obtaining sparse positional strategies for safety games. Such games 
are a commonly used model in many formal methods, as they make the interaction of a system 
with its environment explicit. Example applications are the synthesis of finite-state systems from 
specifications in temporal logic and alternating-time temporal logic (ATL) model checking. Often, a 
winning strategy for one of the players is used as a certificate or as an artefact for further processing 
in the application. Small such certificates, i.e., strategies that can be written down very compactly, 
are typically preferred. For safety games, we only need to consider positional strategies. These map 
game positions of a player onto a move that is to be taken by the player whenever the play enters that 
position. For representing positional strategies compactly, a common goal is to minimize the number 
of positions for which a winning player's move needs to be defined such that the game is still won by 
the same player, without visiting a position with an undefined next move. We call winning strategies 
in which the next move is defined for few of the player's positions sparse. From a sparse winning 
positional strategy for the safety player in a synthesis game, we can compute a small implementation 
satisfying the specification used for building the game, and for ATL model checking, sparse strategies 
are easier to comprehend and thus help in analysing the cause of a model checking result. 

Unfortunately, even roughly approximating the density of the sparsest strategy for a safety game 
has been shown to be NP-hard. Thus, to obtain sparse strategies in practice, one either has to apply 
some heuristics, or use some exhaustive search technique, like ILP (integer linear programming) 
solving. In this paper, we perform a comparative study of currently available methods to obtain 
sparse winning strategies for the safety player in safety games. Approaches considered include 
well-known techniques, such as ILP or SAT (satisfiability) solving, and a novel 
technique based on iterative linear programming. The restriction to safety games is motivated not only 
by the fact that they are the simplest game model for continuous interaction between a system 
and its environment, so that an evaluation of strategy extraction methods should start here, but also 
by the fact that they are sufficient for many applications, such as synthesis. The results of this paper 
shed light on which directions of research in this area are the promising ones, and whether current 
techniques are already scalable enough for practical use. 

1 Introduction 

Games with ω-regular winning conditions have been proven to be valuable tools for the construction and 
analysis of complex systems and are suitable computation models for logics such as the monadic 
second-order logic of one or two successors [10, 16, 17]. By reducing a decision problem to determining the 
winning player in a game, the algorithmic aspect of solving the problem can easily be separated from the 
details of the application under concern. Winning strategies for one of the players in a game can be used 
as certificates for the answer to the original problem or serve as artefacts to be used in other steps of the 
application. 

*This work was partially supported by the DFG as part of the Transregional Collaborative Research Center "Automatic 
Verification and Analysis of Complex Systems" (SFB/TR 14 AVACS). 
For example, when synthesizing finite state systems [15, 13] from temporal logic specifications, 
the winning strategy for the system player in the corresponding game is an artefact that represents a 
system satisfying the specification, and is used for building circuits that implement the specification. In 
alternating-time temporal logic (ATL) model checking, the question is posed whether agents in a certain setting 
can ensure certain global properties of a system to hold. A winning strategy for one of the players in the 
corresponding model checking game represents a certificate for the fact that the agents can achieve their 
goal or that there exists a counter-strategy for the remaining agents to prevent this. The certificate can 
then be used for human inspection on why or why not the agents can achieve their goal. In μ-calculus 
model checking, a strategy for the induced model checking game explains why or why not a given system 
satisfies some property. Again, a winning strategy serves as a certificate that is useful for further analysis 
of the setting. 

In all these cases, certificates and artefacts that have a smaller representation are normally preferred. 
Such solutions are easier to comprehend and have (computational) advantages if used in successive steps 
(like building circuits from strategies in a synthesis game) or for analysing why a certain property holds 
or not. While for automata over ω-regular words, which can be seen as one-player games, there exist 
some results on obtaining compactly representable one-player strategies for Büchi [12] and generalised 
Büchi [3] acceptance conditions, little research has been performed on obtaining compactly representable 
strategies in two-player games, even though it has been noticed that these are desperately needed in 
practice [2]. 

In this paper, we consider the problem of obtaining sparse positional strategies in safety games. 
Whenever a player follows a positional strategy, then the choice of action to perform in one of its positions 
only depends on the position the game is currently in. While positional strategies are too restricted 
to allow representing winning strategies in very expressive game types such as Muller or Streett games in 
general, for simpler game types such as parity or safety games, it is assured that whenever for one of 
the players, a winning strategy exists, then there also exists a winning positional strategy for that player. 
Positional strategies are suitable for giving insight on why a modal μ-calculus formula is valid in some 
model or provide information about why a specification is unrealisable in synthesis, as the obligations 
are encoded into the game graph. Technically, positional strategies are represented as functions from the 
positions of a player to the next move of the player. Thus, at a first glance, all strategies have the same 
size. However, if some position is never reachable along a play, then the player's move at that position 
does not matter, and we can leave the move for this position undefined. Positional strategies with many 
undefined moves can be represented more compactly, have the advantages outlined above, and are what 
we aim at computing in this paper. The number of game positions for which a next move is defined 
in a positional strategy is called its density, and strategies with a low density are called sparse in the 
following. 

For applications such as synthesis, positional strategies are not necessarily the best model: a Mealy 
or Moore machine that implements a specification can have far fewer states than the density of the sparsest 
winning positional strategy for the system player in a corresponding synthesis game. Nevertheless, even 
in synthesis, positional strategies are useful. For example, one of the more recent synthesis approaches, 
namely Bounded Synthesis [15], can easily be altered such that there exists a positional strategy that 
represents the smallest possible implementation. Furthermore, stronger non-approximability results are 
known for non-positional strategies: it was shown that the number of states of the smallest Mealy or 
Moore machine that implements a winning strategy in a safety game is NP-hard to approximate within 
any polynomial function [4], while for positional strategies, non-approximability of the density of a 
sparsest strategy is known only within any constant. 

While safety games are the main computational model that we aim to tackle, the techniques we 
compare in this paper are also useful for more expressive game models such as parity games. Extracting 
winning strategies in parity games can be done by computing a strategy that follows some attractor sets 
computed during the game solving process [18]. If we leave the concrete choice of a successor position 
in such a game open whenever there is more than one possibility to follow the attractor, we obtain a 
non-deterministic strategy that leaves some room for density improvement. Any strategy that is a special 
case of this non-deterministic strategy is a valid winning strategy, just like every strategy that does not 
leave the set of winning positions in a safety game is a valid winning strategy. Thus, the techniques 
discussed here can also be applied to the parity game case, with the drawback that the sparsest winning 
strategy in a parity game might not be a special case of the non-deterministic strategy computed from the 
attractor sets observed during the game solving process, and thus may be missed. Nevertheless, as there 
is, to the best of our knowledge, no work on sparse strategies in parity games yet, using an approach to 
obtain sparse winning strategies in safety games is still the best technique available so far. 

We compare a variety of techniques for obtaining sparse winning strategies in this paper. Apart 
from a fully randomized heuristic, which will serve as a comparison basis, we use a smarter randomized 
heuristic that finds locally optimal strategies and consider the usage of SAT and ILP solvers to obtain a 
sparsest strategy. A novel technique, based on the repeated application of a linear programming solver 
to obtain hints on which game position to add to the strategy domain next, provides a trade-off between 
the density of the strategy and the computation time needed. For comparison, we also consider a recent 
algorithm by Neider [14], which uses computational learning to obtain small non-positional strategies. 
As there is no standard benchmark set available for safety games, we take games from the Bounded 
Synthesis domain. 

We start the following presentation by defining safety games. As we compare the techniques to obtain 
sparse positional winning strategies against the computational learning approach, which produces non- 
positional strategies, we use an action-based definition of safety games, which ensures that the strategy 
types stand on a common ground. In Section 3, we describe the techniques considered to find sparse 
strategies. Then, in Section 4, we briefly describe the benchmarks used. Preceded by a short description 
of the experimental setup (including the tools used), we then state the experimental results in Section 5. 
We close with a discussion of the results and indicate open problems. 

Due to space restrictions, we do not describe how the computational learning-based strategy finding 
approach [14] works and how to produce games from specifications in the Bounded Synthesis [15] process. 
Rather, we assume familiarity with the subjects in the corresponding Sections 3.6 and 4 and only 
explain the connection to this work. 

2 Preliminaries 

2.1 Safety Games 

A safety game is defined as a tuple $\mathcal{G} = (V^0, V^1, \Sigma^0, \Sigma^1, E^0, E^1, v^{init})$. In the game, we have two competing 
players, namely player 0 and player 1. Player 0 has the (finite) set of positions $V^0$, the (finite) set of 
actions $\Sigma^0$, and the partial edge function $E^0 : V^0 \times \Sigma^0 \rightharpoonup (V^0 \uplus V^1)$. Player 1 in turn has her set of 
positions $V^1$, her set of actions $\Sigma^1$, and her edge function $E^1 : V^1 \times \Sigma^1 \rightharpoonup (V^0 \uplus V^1)$. The game also has 
a designated initial position $v^{init} \in (V^0 \uplus V^1)$. For simplicity, we define $V = V^0 \uplus V^1$, $\Sigma = \Sigma^0 \uplus \Sigma^1$, and 
$E : V \times \Sigma \rightharpoonup V$ with $E(v,x) = E^0(v,x)$ if $E^0(v,x)$ is defined and $E(v,x) = E^1(v,x)$ otherwise as shortcuts 
to be used in the following. If for some position $v \in V$ and action $x \in \Sigma$, we have $E(v,x) = v'$ for some 
position $v'$, then we call $v'$ a successor of $v$. The set of successors of a position $v$ is also denoted by 
succ(v). 
In a play of the game, the players move a pebble along the positions in the game. Starting from the 
initial position, whenever the pebble is in a position $v \in V^0$, then player 0 chooses an action $x \in \Sigma^0$ and 
moves the pebble to position $E^0(v,x)$. The case of the pebble being in a position $v \in V^1$ is analogous 
for player 1. By concatenating the actions taken by the two players along the play, we obtain a decision 
sequence in the game. 

Given a set $X$, we denote the set of finite sequences of $X$ by $X^*$, and the set of infinite sequences of 
$X$ by $X^\omega$. A sequence $\pi = \pi_0 \pi_1 \pi_2 \ldots \in V^\omega \cup V^*$ is then a play with a corresponding decision sequence 
$\rho = \rho_0 \rho_1 \rho_2 \ldots \in \Sigma^\omega$ if $\pi_0 = v^{init}$ and for all $i \in \mathbb{N}$, if $\pi_i \in V^0$, then $\rho_i \in \Sigma^0$ and $\pi_{i+1} = E^0(\pi_i, \rho_i)$ (or 
$i = |\pi| - 1$ if $E^0(\pi_i, \rho_i)$ is undefined), and if $\pi_i \in V^1$, then $\rho_i \in \Sigma^1$ and $\pi_{i+1} = E^1(\pi_i, \rho_i)$ (or $i = |\pi| - 1$ 
if $E^1(\pi_i, \rho_i)$ is undefined). Note that for every decision sequence, there is precisely one play to which it 
corresponds. Plays in a game are either winning for player 0 or player 1. Finite plays $\pi = \pi_0 \ldots \pi_n$ for 
which we have $\pi_n \in V^0$ are winning for player 1, whereas for $\pi_n \in V^1$ the play is winning for player 0. 
Infinite plays are won by player 0. 
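
The definitions of plays and winners can be illustrated with a small Python sketch. It is not part of the original formalization: the dictionary-based encoding of the partial edge functions, the function name `play_prefix`, and the tiny example game are assumptions made for illustration only.

```python
from typing import Dict, List, Optional, Tuple

# Assumed encoding: position -> {action -> successor}; a missing action
# means that the (partial) edge function is undefined for that pair.
Edges = Dict[str, Dict[str, str]]


def play_prefix(edges0: Edges, edges1: Edges, init: str,
                decisions: List[str]) -> Tuple[List[str], Optional[int]]:
    """Follow a finite decision sequence from `init`.

    Returns the visited positions and the winner (0 or 1) if the play
    ended because an edge was undefined, or None if it can still go on.
    """
    play = [init]
    for action in decisions:
        pos = play[-1]
        owner = 0 if pos in edges0 else 1
        succ = (edges0 if owner == 0 else edges1)[pos].get(action)
        if succ is None:
            # The play ends in a position of `owner`, so the other player wins.
            return play, 1 - owner
        play.append(succ)
    return play, None


# Hypothetical game: player 0 owns "a" and "c", player 1 owns "b".
E0 = {"a": {"x": "b", "y": "c"}, "c": {}}
E1 = {"b": {"u": "a"}}
print(play_prefix(E0, E1, "a", ["x", "u", "y", "x"]))
```

In this example the play ends in position "c" of player 0, where no edge is defined, so it is won by player 1.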

2.2 Strategies 

When playing the game, a player may follow a predefined strategy. Formally, a strategy for player 
$p \in \{0, 1\}$ is simply a function $f : \Sigma^* \to \Sigma^p$. A decision sequence $\rho$ is said to correspond to $f$ if for the 
play $\pi$ that $\rho$ corresponds to and all $i \in \mathbb{N}$, if $\pi_i \in V^p$, then $\rho_i = f(\rho_0 \ldots \rho_{i-1})$. If all decision sequences that 
correspond to a given strategy of player p induce only plays that are winning for player p, then we call 
the strategy $f$ winning. 

In safety games, it is assured that one of the two players has a winning strategy (see, e.g., fl^). 
If player p has a strategy to win the game, then we say that player p wins the game. We can restrict 
our attention to a special kind of strategies, namely positional strategies. We call a strategy 
$f : \Sigma^* \to \Sigma^p$ positional if for all pairs of prefix decision sequences $\rho = \rho_0 \ldots \rho_n$ and $\rho' = \rho'_0 \ldots \rho'_{n'}$, if 
$E(\ldots E(v^{init}, \rho_0), \ldots, \rho_n) = E(\ldots E(v^{init}, \rho'_0), \ldots, \rho'_{n'})$, then $f(\rho) = f(\rho')$. In other words, at any position 
in a play, the next decision of a player that follows a positional strategy only depends on the position the 
play is in at that time. As a consequence, such a positional strategy can also be described by a function 
$f : V^p \to \Sigma^p$ that maps every position of player p in the game to an action to be chosen by the player 
whenever the position is visited. The restriction to positional strategies is motivated by the fact that in 
safety games, whenever there exists a winning strategy for one of the players, then there also exists a winning 
positional strategy for the player. The standard algorithm to solve safety games (i.e., determining the winner 
of the game) described in the next sub-section also produces positional strategies as certificates/artefacts. 

For comparing different strategies and in particular finding sparse strategies, we need to define a 
density measure for positional strategies. Recall that the motivation of focusing on sparse strategies is that 
they are better comprehensible certificates and have computational advantages when used as artefacts for 
further processing. For positional strategies, we only need to consider choices from positions in $V^p$ that 
are reachable along some path that corresponds to the strategy. If for a positional strategy $f : V^p \to \Sigma^p$, 
there is some position $v \in V^p$ that can never be reached along a path that corresponds to a decision 
sequence that in turn corresponds to the strategy, then for the positional strategy $f$, $f(v)$ can be arbitrary 
without changing the behaviour of the strategy. We thus define the density of a positional strategy for 
player p to be the number of positions of player p that can be visited along some play that corresponds 
to this strategy. 
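
As a rough illustration of this definition, the density of a positional strategy for player 0 can be computed by a reachability search. The sketch below is an assumption-laden example rather than the tooling used in the paper: the dictionary encoding of the game, the name `density`, and the sample game are hypothetical, and the strategy is assumed to be defined on every reachable position of player 0.

```python
from collections import deque


def density(edges0, edges1, init, strategy):
    """Count the player-0 positions visited by some play that follows `strategy`.

    edges0/edges1 map each position to {action: successor} (assumed encoding);
    `strategy` maps player-0 positions to the chosen action.
    """
    seen = {init}
    queue = deque([init])
    while queue:
        pos = queue.popleft()
        if pos in edges0:
            # Player 0 follows the strategy: at most one successor.
            action = strategy.get(pos)
            succ = edges0[pos].get(action) if action is not None else None
            succs = [] if succ is None else [succ]
        else:
            # Player 1 may take any of her actions.
            succs = list(edges1[pos].values())
        for nxt in succs:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return sum(1 for pos in seen if pos in edges0)


# Hypothetical game: player 0 owns "a" and "c", player 1 owns "b" and "d".
E0 = {"a": {"x": "b", "y": "d"}, "c": {"x": "b"}}
E1 = {"b": {"u": "a", "v": "c"}, "d": {"u": "a"}}
print(density(E0, E1, "a", {"a": "x", "c": "x"}))
print(density(E0, E1, "a", {"a": "y", "c": "x"}))
```

In the example, choosing "y" in "a" keeps "c" unreachable, so the second strategy has the lower density even though both functions are defined on the same positions.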

More formally, we could also define $f$ as a partial function from $V^p$ to $\Sigma^p$ and define the strategy 
density to be the size of the domain of $f$. In this case, whenever the pebble is in a position $v^p \in V^p$ for 
which $f(v^p)$ is undefined, we assume that player p declares that she loses the play. If the strategy is still 
winning under this modified definition of who wins a play, then the fact that $f$ is only a partial function 
does not matter, and $f$ can be considered to be a valid positional strategy. 

2.3 Solving Safety Games 

For discussing the problem of obtaining sparse positional strategies in safety games, it is reasonable to 
separate the complexity of the process of solving the game (which is doable in polynomial time) from 
the actual optimization problem of minimizing the strategy (which is NP-hard). Solving the game means 
to identify the set of winning positions in the game, i.e., those for which if any of these positions is an 
initial one, the safety player (player 0) wins the game. Solving a safety game is relatively simple: it 
can be shown that the set of winning positions is precisely the largest set of positions that (1) does not 
contain a position of player 0 that has no successors, (2) for which for every position of player 0, one 
of its successors is in the set, and (3) for every position of player 1 , all of its successors are in the set. 
This largest set can be computed by starting with all positions, and successively removing any position 
that does not satisfy (1), (2), or (3). Once no more positions can be removed, the game solving process 
is complete. 
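
The removal procedure just described can be sketched as follows. This is a minimal, unoptimized illustration under assumed conventions (dictionary-encoded edge functions, the name `winning_region`, and a made-up example game); a practical solver would use a worklist to reach linear time.

```python
def winning_region(edges0, edges1):
    """Largest set of positions satisfying conditions (1) to (3).

    edges0/edges1 map each position to {action: successor} (assumed encoding).
    """
    win = set(edges0) | set(edges1)
    changed = True
    while changed:
        changed = False
        for pos in list(win):
            if pos in edges0:
                # Conditions (1) and (2): some successor must stay in the set;
                # any() over an empty collection is False, covering dead ends.
                ok = any(s in win for s in edges0[pos].values())
            else:
                # Condition (3): all successors must stay in the set.
                ok = all(s in win for s in edges1[pos].values())
            if not ok:
                win.discard(pos)
                changed = True
    return win


# Hypothetical game: "c" is a dead end of player 0, so "b" is lost as well.
E0 = {"a": {"x": "b", "y": "d"}, "c": {}}
E1 = {"b": {"u": "a", "v": "c"}, "d": {"u": "a"}}
print(sorted(winning_region(E0, E1)))
```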

Let $W$ be the set of winning positions and $v^{init} \in W$. Any positional strategy $f : V^0 \to \Sigma^0$ for which 
for all $v \in V^0 \cap W$, we have that $E^0(v, f(v)) \in W$, is a winning one, as it ensures that the set of winning 
positions is not left, by condition (3) above, player 1 cannot initiate leaving $W$ along a play, and no dead 
end for player 0 is part of $W$. At the same time, any positional strategy that allows leaving $W$ at some 
point in a play is not winning. This motivates the description of a most permissive winning strategy for 
player 0 in the game: we define $f' : V^0 \to 2^{\Sigma^0}$ with $f'(v) = \{x \in \Sigma^0 \mid E^0(v,x) \in W\}$ for every $v \in V^0$, 
as every concrete winning positional strategy must be a specialization of $f'$, i.e., have $f(v) \in f'(v)$ for 
every position $v \in V^0 \cap W$ that is reachable along some play that corresponds to $f$. For a procedure 
to find sparse positional strategies in a safety game, we can thus use $f'$ as a basis for finding a sparse 
specialization. 
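
Given the winning region, the most permissive strategy is a direct set comprehension. The snippet below is a small sketch under the assumption that player 0's edge function is stored as a dictionary from positions to `{action: successor}` maps; the function name and the example values are hypothetical.

```python
def most_permissive(edges0, win):
    """Map every player-0 position to the actions whose successor lies in `win`."""
    return {v: {x for x, succ in actions.items() if succ in win}
            for v, actions in edges0.items()}


# Hypothetical example: only "a" and "d" are winning, so "x" (leading to "b")
# is not allowed in "a", and the dead end "c" has no allowed action at all.
E0 = {"a": {"x": "b", "y": "d"}, "c": {}}
W = {"a", "d"}
print(most_permissive(E0, W))
```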

3 Approaches for Obtaining Sparse Winning Strategies 

In the experimental evaluation to follow, we compare five techniques to obtain sparse winning strategies 
in games. In this section, we explain them and state the properties of the approaches. We are particu- 
larly interested in sparsest strategies in safety games, i.e., winning positional strategies with the lowest 
possible density. 

3.1 Randomized Strategy Extraction 

Probably the simplest way to obtain a concrete winning positional strategy from a most permissive 
strategy is to simply pick arbitrarily one allowed action for every winning position of player 0, and then 
to remove all positions that became unreachable from the strategy domain. Here, we perform a random 
pick, based on a uniform distribution over the available actions. 
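
A minimal sketch of this baseline is given below. The game encoding, the function name `random_strategy`, and the optional `rng` parameter are assumptions introduced for the example; the most permissive strategy is passed in as a dictionary from player-0 positions to sets of allowed actions.

```python
import random
from collections import deque


def random_strategy(edges0, edges1, init, permissive, rng=None):
    """Pick one allowed action per position, then drop unreachable positions."""
    rng = rng or random.Random()
    # Uniform random pick among the allowed actions of each position.
    choice = {v: rng.choice(sorted(actions))
              for v, actions in permissive.items() if actions}
    # Breadth-first search over the plays that follow `choice`.
    seen, queue = {init}, deque([init])
    while queue:
        pos = queue.popleft()
        if pos in edges0:
            succs = [edges0[pos][choice[pos]]] if pos in choice else []
        else:
            succs = list(edges1[pos].values())
        for nxt in succs:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    # Keep the move only for positions that can actually be visited.
    return {v: a for v, a in choice.items() if v in seen}


# Hypothetical game in which every position is winning for player 0.
E0 = {"a": {"x": "b", "y": "d"}, "c": {"x": "b"}}
E1 = {"b": {"u": "a", "v": "c"}, "d": {"u": "a"}}
F = {"a": {"x", "y"}, "c": {"x"}}
print(random_strategy(E0, E1, "a", F))
```

Depending on the random pick in "a", the result has density 2 (via "x") or density 1 (via "y"), which is why this method only serves as a comparison basis.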

3.2 Smarter Randomized Strategy Extraction 

Given a game $\mathcal{G} = (V^0, V^1, \Sigma^0, \Sigma^1, E^0, E^1, v^{init})$ with the set of winning positions $W$ for player 0 (and 
$v^{init} \in W$), another way to describe the problem of obtaining sparse winning positional strategies is to 
search for an as-large-as-possible set of positions $Z \subseteq W$ for which the concrete positional strategy function 
should be undefined. Any strategy that respects $Z$ will then have the same density (as otherwise, 
there is some position that we can add to $Z$ and thus $Z$ is not as large as possible). 

As finding the density of sparsest positional strategies is NP-hard to approximate within any constant 
ll5l . finding an approximately largest set Z is also NP-hard. However, we might settle for local optima of 
$Z$, i.e., declare ourselves to be satisfied to obtain a set $Z$ such that there is no position of player 0 in the 
game that can be added to $Z$ such that there is still a winning positional strategy that respects $Z$ (i.e., has 
$f(v)$ undefined for every $v \in Z$). Such a set can be obtained in time polynomial in the size of the game 
(i.e., in $|V^0| \cdot |\Sigma^0| + |V^1| \cdot |\Sigma^1|$). 

In particular, we can do so as follows: we first create a random permutation of $V^0$, and then for every 
position in the list, examine if the safety game is still winning for player 0 if we remove all outgoing 
edges of that position. Whenever this is the case, we add the position to $Z$, and continue. Whenever 
the safety game becomes losing for player 0 with this change, we undo it and try the next position in 
the list. Once every position in the list has been tried, we have obtained a locally optimal set $Z$ (whose local 
maximality can easily be proven by deriving a contradiction from assuming the converse). 
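
The procedure can be sketched in a few lines. The following Python code is an illustrative assumption-based version, not the implementation evaluated in the paper: the names `solve` and `locally_maximal_z`, the dictionary encoding of the game, and the example are hypothetical, and the game is re-solved from scratch after every tentative removal.

```python
import random


def solve(edges0, edges1):
    """Winning region of player 0, computed by repeated removal."""
    win = set(edges0) | set(edges1)
    changed = True
    while changed:
        changed = False
        for pos in list(win):
            if pos in edges0:
                ok = any(s in win for s in edges0[pos].values())
            else:
                ok = all(s in win for s in edges1[pos].values())
            if not ok:
                win.discard(pos)
                changed = True
    return win


def locally_maximal_z(edges0, edges1, init, rng=None):
    """Greedily collect player-0 positions whose moves can stay undefined."""
    rng = rng or random.Random()
    current = {v: dict(actions) for v, actions in edges0.items()}
    order = list(current)
    rng.shuffle(order)  # random permutation of the player-0 positions
    z = set()
    for v in order:
        saved = current[v]
        current[v] = {}  # tentatively remove all outgoing edges of v
        if init in solve(current, edges1):
            z.add(v)  # still winning: keep the edges removed
        else:
            current[v] = saved  # losing: undo the change
    return z


# Hypothetical game: "c" can be avoided by playing "y" in "a".
E0 = {"a": {"x": "b", "y": "d"}, "c": {"x": "b"}}
E1 = {"b": {"u": "a", "v": "c"}, "d": {"u": "a"}}
print(locally_maximal_z(E0, E1, "a"))
```

Each of the $|V^0|$ iterations solves one safety game, which keeps the overall running time polynomial.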

Since we randomize the permutation, for every game, there is a non-zero probability of obtaining a 
sparsest strategy. However, it is not hard to define a series of games $\mathcal{G}_1, \mathcal{G}_2, \ldots$ for which the sizes of the 
games $\mathcal{G}_i$ grow linearly in $i$, but for which the probability to obtain a sparsest strategy using the algorithm 
above is at most ~ for every game 

3.3 Integer Linear Programming 

Given a game $\mathcal{G}$, we can formulate the problem of obtaining a sparse positional winning strategy for 
player 0 as an integer linear programming (ILP) problem, in which we use one variable per position 
in the game. Whenever we obtain a solution to the ILP problem, a variable value of 1 is supposed to 
mean that the position can be reached from the initial position along some path that corresponds to the 
computed strategy, whereas a value of 0 means the opposite. By minimizing the sum of the variables that 
correspond to the vertices of $V^0$, we can obtain a sparsest strategy. 

Formally, an ILP problem is a three-tuple $(X,F,C)$, for which $X$ is a set of variables, $F$ is a linear 
function over $X$ that is to be minimized, and $C$ is a set of linear constraints over the allowed values of $X$. 
Given a game $\mathcal{G} = (V^0, V^1, \Sigma^0, \Sigma^1, E^0, E^1, v^{init})$ and a most permissive strategy $f : V^0 \to 2^{\Sigma^0}$, we can 
encode the problem of obtaining a sparsest positional winning strategy for player 0 that is a specialization 
of $f$ into an ILP problem $(X,F,C)$ by setting $X = V$, $F = \sum_{v \in V^0} v$, and: 

$$C = \bigcup_{v \in V} \{v \geq 0, v \leq 1\} \cup \{v^{init} \geq 1\} \cup \Big\{-v + \sum_{x \in f(v)} E^0(v,x) \geq 0 \;\Big|\; v \in V^0\Big\} \cup \{-v + v' \geq 0 \mid v \in V^1, v' \in \mathrm{succ}(v)\}.$$

There are four types of constraints in this ILP formulation: first of all, all variable values are restricted to lie 
between 0 and 1. Then, the variable corresponding to the initial position in the game is forced to be 1. For 
every position of player 0 whose variable value is > 0, the third kind of constraint ensures that the variable 
for some successor position that is reachable via some action allowed by $f$ has to be set to 1. Finally, for 
positions of player 1 whose variable has a value of > 0, the variables for all successor positions have to 
be 1. For actually obtaining a positional strategy from a variable assignment $a : X \to \{0, 1\}$, for every 
position $v \in V^0$ with $a(v) = 1$, we pick an action allowed by $f$ that leads to a successor in $\{v' \in V \mid a(v') = 1\}$. 
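
To make the constraint system concrete, the sketch below checks the constraints for every 0/1 assignment of a tiny game and keeps the one with the smallest objective value. This exhaustive enumeration merely stands in for an ILP solver and is exponential in the number of positions; the encoding, the function name, and the example game are assumptions made for illustration.

```python
from itertools import product


def sparsest_by_enumeration(edges0, edges1, init, permissive):
    """Minimize the sum over player-0 variables subject to the constraints."""
    positions = sorted(set(edges0) | set(edges1))
    best = None
    for bits in product((0, 1), repeat=len(positions)):
        a = dict(zip(positions, bits))
        if a[init] < 1:
            continue  # violates v_init >= 1
        # -v + (sum of successor variables over allowed actions) >= 0
        ok0 = all(-a[v] + sum(a[edges0[v][x]] for x in permissive[v]) >= 0
                  for v in edges0)
        # -v + v' >= 0 for every successor v' of a player-1 position v
        ok1 = all(-a[v] + a[s] >= 0
                  for v in edges1 for s in edges1[v].values())
        if ok0 and ok1:
            cost = sum(a[v] for v in edges0)
            if best is None or cost < best[0]:
                best = (cost, a)
    return best


# Hypothetical game in which every position is winning for player 0.
E0 = {"a": {"x": "b", "y": "d"}, "c": {"x": "b"}}
E1 = {"b": {"u": "a", "v": "c"}, "d": {"u": "a"}}
F = {"a": {"x", "y"}, "c": {"x"}}
print(sparsest_by_enumeration(E0, E1, "a", F))
```

For this game the optimum sets only the variables of "a" and "d" to 1, which corresponds to the strategy that plays "y" in "a" and has density 1.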
3.4 SAT-based Strategy Extraction 

The ILP formulation of the sparsest positional strategy problem has the property that when regarding the 
variables as Boolean by interpreting 0 as false and 1 as true, all of the constraints can be represented 
as a disjunction of Boolean literals. For example, a constraint $-v_1 + v_2 + v_3 \geq 0$ can be written as 
$\neg v_1 \vee v_2 \vee v_3$ in the Boolean domain. By rewriting the ILP instance in this way, SAT (satisfiability) 
solvers can be applied. A SAT-based approach to strategy finding has already been pursued in [7]. 
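
The rewriting into clauses can be sketched as follows. The code produces clauses as lists of signed integers in the style of the DIMACS CNF format; this output convention, the function name `to_clauses`, and the example game are assumptions for illustration, and the optimization objective is not encoded here. The bound constraints disappear, since Boolean variables are 0 or 1 by definition.

```python
def to_clauses(edges0, edges1, init, permissive):
    """Translate the constraint system into clauses over integer literals."""
    ids = {v: i + 1 for i, v in enumerate(sorted(set(edges0) | set(edges1)))}
    clauses = [[ids[init]]]  # the initial position must be reached
    for v in edges0:
        # not v, or one of the successors under an allowed action
        succs = sorted({ids[edges0[v][x]] for x in permissive[v]})
        clauses.append([-ids[v]] + succs)
    for v in edges1:
        for s in edges1[v].values():
            clauses.append([-ids[v], ids[s]])  # v implies each successor
    return ids, clauses


# Hypothetical game in which every position is winning for player 0.
E0 = {"a": {"x": "b", "y": "d"}, "c": {"x": "b"}}
E1 = {"b": {"u": "a", "v": "c"}, "d": {"u": "a"}}
F = {"a": {"x", "y"}, "c": {"x"}}
ids, clauses = to_clauses(E0, E1, "a", F)
print(ids)
print(clauses)
```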

Most currently available SAT solvers, however, cannot take into account optimization objectives when 
computing a solution. To use such a solver, we could encode a cardinality constraint on the 
number of variables for player 0's positions that may be set to true, and perform a binary search 
on the best possible strategy density. For this paper, we use the SAT solver OPTS AT V. 1.1 Q that has 
this functionality already built in. 

3.5 Repetitive Linear Programming 

The integer linear programming approach from Section 3.3 is exact and guaranteed to find a sparsest 
strategy. As the problem of obtaining sparsest winning positional strategies is NP-hard, we cannot expect 
ILP solvers to work fast on ILP instances that encode this problem in general. To counter this fact, 
we propose an alternative approach here, which implements a heuristic based on linear programming 
over the real numbers (LP). In contrast to ILP solving, LP solving can be performed in polynomial time. 

Consider the constraint system built in the ILP approach of Section 3.3, but this time over the real 
numbers. After applying a linear programming solver to the system, we obtain a variable valuation $v$, 
which is, w.l.o.g., of the form $(v_1, v_2, \ldots, v_m)$ for $m = |V|$. Some values here might be 0, some might be 
1, and in many cases, some values are in between. Thus, the values might not represent an actual solution 
to the sparse strategy problem. We can however fix the vector in an iterative fashion. Suppose that we 
start the linear programming solver on the problem again, but this time fix all variables that were 0 in 
$v$ after the previous solver run to 0, fix all variables that had a value of 1 in $v$ to 1, and additionally fix 
one variable whose value was equal to $\max\{v_i \mid 1 \leq i \leq m, v_i \neq 1\}$ to 1. The linear programming solver 
will compute a new solution, but possibly with a worse value of the objective function. However, the 
number of variables that are not 0 or 1 will have decreased by at least 1. If we iterate the process until 
all variables have values of either 0 or 1, we have a blueprint for a sparse, but not necessarily sparsest 
strategy. However, the complexity of this approach is only polynomial, and we use the LP solver to guide 
our search for a sparse winning strategy. 
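
The iteration scheme itself is independent of the LP solver used. The sketch below shows only the fixing loop; the solver is passed in as a callable `solve_lp`, which is a placeholder assumed here and not an interface from the paper or from any particular library. It is expected to return an optimal valuation of the relaxed constraint system in which the given variables are forced to the given values. The tolerance `eps` is likewise an assumption to cope with floating-point results.

```python
from typing import Callable, Dict, List

Valuation = Dict[str, float]


def repeated_lp(variables: List[str],
                solve_lp: Callable[[Dict[str, int]], Valuation],
                eps: float = 1e-9) -> Dict[str, int]:
    """Call the LP solver repeatedly until every variable is 0 or 1."""
    fixed: Dict[str, int] = {}
    while True:
        val = solve_lp(fixed)
        fractional = []
        for v in variables:
            if val[v] <= eps:
                fixed[v] = 0  # keep variables at 0 fixed to 0
            elif val[v] >= 1 - eps:
                fixed[v] = 1  # keep variables at 1 fixed to 1
            else:
                fractional.append(v)
        if not fractional:
            return dict(fixed)
        # Additionally fix one variable with the largest value below 1 to 1.
        fixed[max(fractional, key=lambda v: val[v])] = 1
```

Since every round fixes at least one previously fractional variable, the loop performs at most $|V| + 1$ solver calls, provided the solver respects the fixed values.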

3.6 Computational Learning of Sparse Strategies 

Recently, the problem of obtaining compactly representable winning strategies in safety games has been 
tackled from a computational learning perspective by Neider [14]. In computational learning of a regular 
language over finite words, the task is to obtain a deterministic finite automaton (DFA) representation of 
such a language using only equality and containment checks. The idea in applying this approach to strategy 
extraction is that we use the prefixes of the winning decision sequences for player 0 in a game as a 
language to be learned, but we can actually stop the learning process after a subset of this language 
has been learned that is closed under appending allowed actions of player 1 (i.e., those actions that are 
available to player 1 at a respective point in the game). The left part of Figure 1 depicts an example 
automaton for such a language. 

Note that the automaton is also concerned with actions of player 1, and when taking its number of 
states as size measure, it can easily be larger than the density of a positional strategy. However, at the 
Figure 1: Translating a DFA that represents winning decision sequences in a safety game with $\Sigma^0 = \{x,y,z\}$, 
$\Sigma^1 = \{u,v\}$, and $v^{init} \in V^1$ into a Mealy machine. Note that as from $q_1$ in the DFA, there are 
multiple possible next actions, for the Mealy machine, we just picked any of them (i.e., z). Edges in the 
Mealy machine denote both inputs and outputs. For example, the transition from $s_1$ to $s_0$ is taken when u 
is read in state $s_1$, and when taking the transition, x is put out. 

same time, a strategy automaton can also be smaller, as it allows merging states with the same suffix 
language. Also, a strategy DFA might offer more than one possible action to player 0 at any point in the 
play, and there is no guarantee that there actually exists a positional strategy in the game that the DFA 
represents (or overapproximates). As a consequence, the density of the sparsest positional strategy and 
the size of the smallest automaton-based representation of a strategy are incomparable. 

For games that represent some synthesis problem and have strict alternation between the two players 
in the game, positional strategies are not necessarily the model of choice. Typically, when the safety 
player is winning such a game, it is desired to build a Mealy or Moore machine from a winning strategy 
that then represents a reactive system that satisfies the specification that the game is built from. Such a 
Mealy or Moore machine takes the actions of the other player as input and produces player 0's actions 
as output. Any trace that the machine may produce must then be a winning decision sequence in the 
original game. A Mealy or Moore machine can have a size (represented by its number of states) that is 
far less than the density of the sparsest winning positional strategy in a game. For example, a game with 
many positions could be winning for the system player by always playing the same action. A machine 
representing this strategy would only have one state, whereas many positions of the safety player in the 
game might be visited along a corresponding play. While it is always possible to translate a winning 
positional strategy of some density n into a Mealy machine of size at most n + 1 (assuming that player 1 
plays first in the game), the DFA produced by a computational learning approach is equally suitable as a 
starting point for a Mealy machine computation: we use the state set of the DFA as state set of the Mealy 
machine, but contract a sequence of two successive transitions that represents the input and the output in 
one round to one transition in the Mealy machine. The number of states that then remain reachable is the 
size of the Mealy machine. Figure 1 illustrates this translation process. For a more thorough definition 
and discussion of the connection between Mealy /Moore machines and games, see IH. 

4 Benchmarks 

To make our experimental evaluation as insightful as possible, we only consider games from practice 
as benchmarks, and leave out the commonly used randomly generated games and toy examples such as 
variants of tic-tac-toe or other folk games. Instead, we use games stemming from Bounded Synthesis 
[15], which is an approach for the synthesis of finite-state systems from specifications in temporal logic. 
Intuitively, a synthesis process that follows this approach starts by representing the specification as an 
automaton that ensures that for every trace of a system to be synthesized that we declare to be illegal, the 
automaton has some corresponding run on which some so-called rejecting state is visited infinitely often. 
If we now restrict the number of visits to these rejecting states along a run to some finite value b, and find a 
system for which all automaton runs for all traces of the system visit the rejecting states of the automaton 
only at most b times, then we have a valid implementation. At the same time, the problem of synthesizing 
such solutions can be reduced to safety game solving, which makes the approach conceptually simple. 

Here, we consider two variants of building the games from specifications. The first one uses the 
classical construction from [15], adapted to finding Mealy machines instead of Moore machine implementations. 
In the second one, we use a modification proposed in [6]: we allow the system player to 
voluntarily put herself into an unnecessarily bad situation in the game. In a bounded synthesis game, 
positions are labelled by some counter vector $(c_1, \ldots, c_n)$, which is updated whenever both players have 
made their moves. The positions have the property that for two positions $v$ and $v'$ labelled by $(c_1, \ldots, c_n)$ 
and $(c'_1, \ldots, c'_n)$ such that for every $i \in \{1, \ldots, n\}$, we have $c_i \geq c'_i$, all of player 0's winning strategies 
for $v^{init} = v$ are also winning strategies for the same game but with $v^{init} = v'$. Thus, by allowing player 0 
to increase her counters voluntarily, we do not give her additional power. Additionally, we introduce 
a position for player 0 to increase her counter values from the initial ones before the actual start of the 
game. While this modification does not give player 0 more possibilities to win the game, it allows us 
to find sparser strategies. In fact, it is a corollary of Theorem 2 of [15] that if and only if there exists 
some Mealy machine with n states that satisfies the specification and adheres to a bound of b, then the 
bounded synthesis game with the counter increase possibility for player will allow a strategy of den- 
sity « • I + 1. Thus, searching for the sparsest positional strategy will lead to the smallest Mealy-type 
implementation. Note that strictly speaking, the safety games resulting from the modification do not 
conform to the safety game definition in Section |2] any more, as the counter increasing possibility leads 
to multiple successors that all correspond to the same action for some positions of player 0. However, 
for approaches to find sparse positional winning strategies, this makes no difference. For both variants 
of Bounded Synthesis, we consider the following benchmarks: 

• a basic mutex (BasicMutex), for the linear-time temporal logic (LTL) specification 
$\psi = G(r_1 \rightarrow F g_1) \wedge G(r_2 \rightarrow F g_2) \wedge G(\neg g_1 \vee \neg g_2)$, the input bits $\{r_1, r_2\}$, and the output bits $\{g_1, g_2\}$, 

• a basic reaction scheme (BasicReaction) with the specification $\psi = (x \rightarrow G \neg z) \wedge (\neg x \rightarrow G z)$ for 
the input bits $\{x\}$ and the output bits $\{z\}$, 

• three dining philosophers (ThreePhilosophers) getting hungry at the same time, with 
$\psi = G(h \rightarrow X(F e_1 \wedge F e_2 \wedge F e_3)) \wedge G((\neg e_1 \vee \neg e_2) \wedge (\neg e_1 \vee \neg e_3) \wedge (\neg e_2 \vee \neg e_3))$ for the input bit $h$ and the output 
bits $\{e_1, e_2, e_3\}$ (describing which philosophers are eating), and 

• some examples from [11], mostly arbiter and traffic light examples (demo-v3 ... demo-v23). 
Unrealizable specifications have been left out. 

All benchmarks are parametrized by the bound value. For example, the table entry BasicMutex(3) in the 
following section refers to the basic mutex example with a bound value of 3. If the second 
variant of the Bounded Synthesis process is used, in which player 0 has the possibility to voluntarily 
increase some counter values, the benchmark name appears primed, e.g., as in BasicMutex'(3). 
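
To make the basic mutex specification more tangible, the following sketch runs a two-state Mealy machine that alternates the grant between the two requesters regardless of the requests, and checks the specification on finite input prefixes. This is only a bounded sanity check under our own assumptions, not an LTL model checker: mutual exclusion is checked at every step, and every request is required to be granted in the same or the next step. The function names are hypothetical.

```python
from itertools import product

def alternating_mutex(requests):
    """Run a two-state Mealy machine that alternates grants and ignores requests."""
    state = 0
    trace = []
    for r1, r2 in requests:
        g1, g2 = (state == 0), (state == 1)
        trace.append((r1, r2, g1, g2))
        state = 1 - state  # switch to the other requester
    return trace

def check_prefix(trace):
    """Check mutual exclusion and grant-within-one-step on a finite prefix."""
    for t, (r1, r2, g1, g2) in enumerate(trace):
        if g1 and g2:
            return False  # violates G(not g1 or not g2)
        # Liveness can only be judged if the next step is part of the prefix.
        if t + 1 < len(trace):
            for requested, grant_index in ((r1, 2), (r2, 3)):
                if requested and not (trace[t][grant_index] or trace[t + 1][grant_index]):
                    return False
    return True

# Enumerate all 4^4 request sequences of length four.
valuations = list(product([False, True], repeat=2))
all_ok = all(check_prefix(alternating_mutex(seq)) for seq in product(valuations, repeat=4))
```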

5 Experimental Results 

We implemented the approaches described in Section 3 in C++, except for the learning approach [14], 
for which we use an implementation provided by the author of [14] (also written in C++). For OPTSAT 
[9] and the learning-based tool, we used default settings. As (integer) linear programming library, we 
took LIBLPSOLVE v5.5. 

For obtaining the benchmarks, we implemented a tool that computes safety games for the Bounded 
Synthesis approach, without using any symbolic data structure such as binary decision diagrams (BDDs). 
Benchmarks for which the preparation required more than 64 gigabytes of RAM were left out. This limit 
was frequently exceeded for the modified Bounded Synthesis approach, as many of the resulting games 
have a huge number of positions, even though the percentage of positions that are winning for the safety 
player, and thus are input to the strategy density optimization algorithms, is quite low. Typically, we 
scaled the bound for the synthesis benchmarks up to 5. However, if the number of winning positions in a game 
exceeds 10000 for some bound b, or if increasing the bound would yield the same game, we did 
not consider higher bounds. All games were pruned to the positions reachable when player 0 follows 
some arbitrary specialization of the most permissive strategy. 
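
A minimal sketch of such a pruning step is given below. It reflects our reading of the sentence above, namely that a position is kept if it is reachable when player 0 only takes moves that stay within the winning positions, while the other player may take any move. The game encoding (`edges`, `owner`) and the example game are hypothetical.

```python
from collections import deque

def prune_to_reachable(edges, owner, winning, initial):
    """Return the positions reachable when player 0 only plays winning moves.

    edges[v] maps each action available in v to the successor position,
    owner[v] is 0 or 1, and winning is the set of winning positions.
    """
    reachable = {initial}
    queue = deque([initial])
    while queue:
        v = queue.popleft()
        for succ in edges[v].values():
            if owner[v] == 0 and succ not in winning:
                continue  # not allowed by the most permissive strategy
            if succ not in reachable:
                reachable.add(succ)
                queue.append(succ)
    return reachable

# Hypothetical game: "bad" is losing for the safety player, "u" is winning
# but cannot be reached from the initial position "s".
edges = {
    "s": {"a": "p", "b": "q"},
    "p": {"x": "s", "y": "bad"},
    "q": {"x": "s", "y": "r"},
    "r": {"a": "s"},
    "u": {"x": "s"},
    "bad": {"a": "bad"},
}
owner = {"s": 1, "p": 0, "q": 0, "r": 1, "u": 0, "bad": 1}
winning = {"s", "p", "q", "r", "u"}
kept = prune_to_reachable(edges, owner, winning, "s")
```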

We used a Sun XFire computer with 2.6 GHz AMD Opteron processors running an x64 version of 
Linux for obtaining the results. All tools considered are single-threaded. We restricted the memory usage 
for the strategy extraction to 4 GB and set a timeout of 600 seconds per invocation. All tools were run 
five times (25 times for the randomized approaches) to level out fluctuations. The tables below 
report mean values. 

5.1 Strategy Densities 

Table 1 and Table 2 compare the obtained strategy densities (or sizes for non-positional strategies) on the 
classical Bounded Synthesis games, whereas Table 3 considers the Bounded Synthesis benchmarks with 
the modification that player 0 can increase counter values at will. Timeouts are represented by "t/o". 
Since, with the modification switched on, building the safety games resulted in running out of memory in 
many cases, Table 3 only has relatively few entries. The remaining benchmarks have a low to medium 
number of positions, as the non-winning positions have already been pruned away, and these constitute 
the majority of positions created while building the game. However, the large search spaces and the poor 
performance of the purely random strategy extraction approach show that the benchmarks are still far 
from being trivial. The search space size (in bits) represents how many syntactically different positional 
strategies are possible, and is defined to be $\sum_{v \in V_0} \log_2(|\{x \in \Sigma_0 \mid E(v,x) \in W\}|)$ for the set of winning 
positions $W$. Quite often, the sparsest strategies only have a very low density. This is not a surprising 
situation in synthesis, as many systems can be implemented in very few states. The combination of large 
search spaces and the availability of sparse winning strategies makes the benchmarks at hand an excellent 
competition ground for the sparse strategy extraction approaches. To compare the density of positional 
strategies and the size of learning-based strategies, for all tables, the numbers of input and output atomic 
propositions in the benchmark are also given. 
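
The search space size defined above can be computed directly from an explicit game. The following sketch assumes a game that has already been pruned, so that every position of player 0 has at least one successor in the set of winning positions; the encoding and the example values are hypothetical.

```python
import math

def search_space_bits(v0_positions, edges, winning):
    """Sum, over the positions of player 0, log2 of the number of winning moves."""
    total = 0.0
    for v in v0_positions:
        choices = sum(1 for succ in edges[v].values() if succ in winning)
        if choices == 0:
            raise ValueError(f"position {v!r} has no winning successor; prune the game first")
        total += math.log2(choices)
    return total

# Hypothetical player-0 positions with 1, 2, and 4 winning moves, respectively.
edges = {
    "p": {"x": "s", "y": "bad"},
    "q": {"x": "s", "y": "r"},
    "u": {"w": "s", "x": "r", "y": "p", "z": "q"},
}
winning = {"s", "p", "q", "r", "u"}
bits = search_space_bits(["p", "q", "u"], edges, winning)
```

Here, the three positions contribute 0, 1, and 2 bits, which corresponds to 1 · 2 · 4 = 8 syntactically different positional strategies.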

It can be seen that for both Bounded Synthesis variants, the randomized approach and the repetitive 
linear-programming approach are quite competitive against the exact minimization approaches (ILP and 
OPTSAT), despite the large search space. For many benchmarks for which very sparse strategies are 
possible (e.g., demo-v8, demo-v12, demo-v13), all of the approaches dealing with positional strategies 
find some sparsest strategy. Furthermore, there is no clear winner between the smart randomized approach and 
repetitive linear programming. For example, for the basic mutex (unprimed), the latter approach always 
finds a sparsest strategy, whereas the randomized approach does not. On the other hand, for the dining 
philosophers (unprimed) and other benchmarks like demo-v9, the situation is reversed. 

When evaluating how well the computational learning approach works, we need to compare across 
tables. In Table 3, the approach is not listed. The reason is that due to the counter increase option of 
player 0, there can be many successors in the game that all correspond to the same action, and the 
implementation of the approach does not support such games. However, since the learning approach can 
already find the smallest strategy in the games produced in the classical Bounded Synthesis approach, 
this is no drawback. Recall from Section 4 that a Mealy machine with n states 
that satisfies the specification and respects some bound b exists if and only if, in the modified synthesis game for bound 
b, there is a positional strategy with density $n \cdot |\Sigma^I| + 1$. This fact allows us to measure the success of 
the learning approach. We can see that in most cases, it did not find the minimal implementation. For 
example, for BasicMutex'(1), the sparsest positional strategy has density 9, i.e., $n \cdot |\Sigma^I| + 1$ for $n = 2$. 
Intuitively, for this benchmark, a Mealy machine with two states that satisfies the specification would 
simply alternate between giving the grant to the two requesters. The Mealy machine sizes for the 
learning approach and BasicMutex(b) for $b \in \{1, \ldots, 5\}$ are however larger, and grow with the values of b. 
Thus, the learning approach can be fooled by needlessly large games. However, for benchmarks such as 
demo-v22(1), for which the modified version of the game was too large to fit into 64 GB of memory, the 
learning approach can deal well with the classical version of the game: a Mealy machine with 13 states 
is found, although the sparsest positional strategy has 201 reachable positions of player 0, for $|\Sigma^I| = 8$. The 
benchmark demo-v22 represents an elevator controller synthesis problem. For comparison, the 
numbers of states of the deterministic finite automata produced from the benchmarks in the learning-based 
approach are also given in Table 1 and Table 2. 
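
The relation between the density of a strategy in the modified game and the number of Mealy machine states is simple arithmetic, which the following sketch spells out. We assume here that the second factor in the density formula is the number of input letters (four for the two request bits of the basic mutex); the function names are ours.

```python
def density_for_mealy(n_states, n_input_letters):
    """Density of the strategy corresponding to a Mealy machine with n_states states."""
    return n_states * n_input_letters + 1

def mealy_states_for_density(density, n_input_letters):
    """Invert the density formula; fails if the density does not fit the formula."""
    states, rest = divmod(density - 1, n_input_letters)
    if rest != 0:
        raise ValueError("density is not of the form n * inputs + 1")
    return states

# BasicMutex'(1): two input bits give four input letters, and a two-state
# machine yields the density 9 reported in the text.
basic_mutex_density = density_for_mealy(2, 4)
basic_mutex_states = mealy_states_for_density(9, 4)
```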

5.2 Computation Times 

Table 4 presents computation times for the classical Bounded Synthesis benchmarks, whereas Table 5 
describes the results for the modified version. For brevity, benchmarks for which all tools needed less 
than 50 ms of computation time have been left out. 

The tables show no big surprises. The exact approaches time out for the largest benchmarks. For the 
benchmarks stemming from the modified Bounded Synthesis version, OPTSAT performs better than the 
ILP-based approach, whereas for the non-modified version, the ILP solver seems to be faster. The main 
difference between the two classes is the fact that the number of successors of positions of player 0 is 
much higher in the modified synthesis games. OPTSAT seems to be able to deal with this situation in 
a better way. The learning approach is typically slower than the heuristics for positional strategies, but 
unlike the exact approaches, did not time out for any of the benchmarks. 

5.3 Robustness of the Approaches 

So far, we have only been concerned with the mean strategy densities (or sizes for non-positional 
strategies) and computation times. For practical use, it is also of importance that the fluctuations in both of 
these values are as low as possible. As we ran all benchmark/tool combinations 5 or 25 times, we can 
analyse the standard deviation for the strategy densities and times here. 
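
As a small illustration of how such figures can be obtained from repeated runs, the following sketch computes the mean and the standard deviation of five strategy densities. The run results are hypothetical, and the use of the population standard deviation (rather than the sample standard deviation) is an assumption of ours. We note that it yields approximately 0.4899 for five integer results of which two differ by one from the other three, a value that also occurs in Table 6.

```python
import statistics

# Hypothetical densities obtained in five runs of a heuristic on one game.
densities = [10, 10, 10, 11, 11]

mean_density = statistics.mean(densities)
# Population standard deviation: divide by the number of runs, not by runs - 1.
std_density = statistics.pstdev(densities)
```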

In terms of strategy size, the ILP and OPTSAT approaches have no fluctuations (as they are 
exact), and the strategy densities stemming from the purely random approach have a high standard 
deviation of up to 375 for demo-v19'(1). However, typically, this value is between 2 and 50. The 
learning approach works deterministically and always returned the same result for an input safety game. 
For all benchmarks except for demo-v16 to demo-v20 and demo-v22, the standard deviations for 
the smart randomized approach are below 8, and for the repetitive LP approach, below 2.1. For the 
excepted benchmarks, the repetitive LP approach appears to be more robust, as demonstrated by Table 6, which 
shows the standard deviations for demo-v16 and demo-v17 as examples. 

As far as the time is concerned, all approaches except for the smart randomized one are quite robust 
and have standard deviations in their computation times that are typically lower than five percent of the 

Table 1: Strategy density/size comparison for the classical Bounded Synthesis benchmarks (part one). 

Table 2: Strategy density/size comparison for the classical Bounded Synthesis benchmarks (part two). 

                                                                          21.2    10    6
demo-v21(1)      97    6  16  16                97    97    97      97      97    31    6
demo-v21(2)     417   26  16  16        20     417    97    97      97      97   111   13
demo-v21(3)    2017  126  16  16   155.098    2017    97    97      97      97   333   74
demo-v22(1)     353   44   2   8        40   314.3   201   201     201     201    28   13
demo-v22(2)     505   63   2   8       139   454.4   201   201   227.9     201    37   18
demo-v22(3)     633   79   2   8       211   582.1   201   201   229.5   201.4    47   23
demo-v22(4)     761   95   2   8       268   703.1   201   201   228.5   202.8    57   28
demo-v22(5)     889  111   2   8       325   816.4   201   201   216.4     204    67   33
demo-v23(1)      27   13   2   2        11   20.44    15    15   15.56      15    10    5



Table 3: Strategy density comparison for the Bounded Synthesis benchmarks, with modification to allow 
for sparser strategies. 

Benchmark                |V0|               Search space   Random   ILP  OPTSAT   Random  RepLP
                                                  (bits)   (dumb)                (smart)
BasicMutex'(1)             33     8   4   4           66    28.04     9       9        9      9
BasicMutex'(2)            121    30   4   4      414.762    60.04     9       9        9      9
BasicMutex'(3)            289    72   4   4      1298.79    130.8     9       9        9      9
BasicMutex'(4)            561   140   4   4      2998.44    214.6     9       9        9      9
BasicMutex'(5)            961   240   4   4       5816.8    348.4     9       9        9      9
BasicReaction'(1)          13     6   2   2            7     7.88     7       7        7    7.2
BasicReaction'(2)          19     9   2   2      17.0947     7.96     7       7        7      8
BasicReaction'(3)          25    12   2   2      29.5098     7.88     7       7        7    7.2
BasicReaction'(4)          31    15   2   2      43.7633     9.64     7       7        7    7.4
BasicReaction'(5)          37    18   2   2      59.5361     9.24     7       7        7      8
ThreePhilosophers'(3)     217   108   8   2      865.396     53.4     7       7        7      7
ThreePhilosophers'(4)     785   392   8   2      4246.45    83.16   t/o       7        7      7
ThreePhilosophers'(5)    1921   960   8   2      12504.8    114.3   t/o       7        7      7
demo-v8'(1)                37    18              136.287    16.44                       3      3
demo-v8'(2)                97    48   2   2      472.304    31.24     3       3        3      3
demo-v8'(3)               201   100   2   2      1164.66    56.92     3       3        3      3
demo-v8'(4)               361   180   2   2      2365.96    83.64     3       3        3      3
demo-v8'(5)               589   294   2   2      4239.85    118.5     3       3        3      3
demo-v9'(1)               649   324   2   2      2517.11    39.32     5       5     5.24      6
demo-v9'(2)              4609  2304   2   2        24739    71.32   t/o       5     5.56    6.2
demo-v13'(1)               55    27   2   2      259.934     42.2     3       3        3      3
demo-v13'(2)              129    64   2   2          773    105.8     3       3        3      3
demo-v13'(3)              251   125   2   2      1747.67    196.9     3       3        3      3
demo-v13'(4)              433   216   2   2      3357.28      347     3       3        3      3
demo-v13'(5)              687   343   2   2      5785.47    547.3     3       3        3      3
demo-v15'(1)              169    42   4   4      542.836    104.2    13      13    13.96   14.6
demo-v15'(2)              769   192   4   4      3614.23      261    13      13    14.28   14.6
demo-v19'(1)             2241   560   4   4      12946.8    781.5     9       9      9.8      9
demo-v23'(1)             2521  1260   2   2        14742    186.8   t/o       5     6.84      5

Table 4: Computation time comparison for the classical Bounded Synthesis benchmarks. All times are 
given in seconds. 

Benchmark                  Random      ILP   OPTSAT    Random    RepLP  Learning
                           (dumb)                     (smart)
BasicMutex(4)             0.00671  0.00948  0.06751   0.00804   0.0204     0.038
BasicMutex(5)             0.00865   0.0109   0.4895   0.00736   0.0107   0.07821
ThreePhilosophers(4)      0.00795     2.58    3.241   0.00847    0.059   0.04376
ThreePhilosophers(5)      0.00925      129     89.7    0.0114    0.241     0.114
demo-vJ(2)                0.00659   0.0132   0.0641    0.0117   0.0262   0.02546
demo-v6(3)                0.00874   0.0427  0.04663    0.0335   0.0584   0.06382
demo-v10(2)               0.00629  0.00974  0.07321   0.00747  0.00999   0.01083
demo-v10(5)               0.00831   0.0165   0.1267   0.00979   0.0151   0.07951
demo-v14(2)               0.00675   0.0133  0.05853   0.00664   0.0123   0.03176
demo-v14(3)               0.00845   0.0174  0.04967    0.0104   0.0156    0.1244
demo-v14(4)               0.00896   0.0237   0.3457    0.0112   0.0189    0.4291
demo-v14(5)                0.0107   0.0337    4.473    0.0134   0.0261     1.271
demo-v15(4)               0.00677   0.0177   0.1765   0.00989   0.0167    0.0701
demo-v15(5)               0.00834   0.0266   0.1027   0.00938   0.0187    0.1399
demo-v16(2)               0.00903    0.408   0.2805     0.035    0.555    0.1248
demo-v16(3)                0.0144     84.9    86.47    0.0589     1.46     0.846
demo-v16(4)                0.0219      548      t/o     0.111     3.24      5.05
demo-v16(5)                0.0325      t/o      t/o     0.214     8.11     19.91
demo-v17(2)               0.00762   0.0345   0.1598   0.00947   0.0205    0.1075
demo-v17(3)                0.0123    0.509    16.84     0.017   0.0906     1.963
demo-v17(4)                0.0245     2.02      t/o    0.0455    0.253     20.97
demo-v17(5)                0.0353     4.45      t/o    0.0691    0.571       163
demo-v18(3)                0.0885      t/o      t/o     0.917     92.8     247.5
demo-v19(3)               0.00761   0.0142   0.2191   0.00896   0.0156   0.06583
demo-v19(4)               0.00764   0.0173    1.272    0.0106   0.0193    0.1379
demo-v19(5)               0.00929   0.0219     1.51    0.0127    0.022    0.2704
demo-v20(1)                0.0127   0.0323   0.9498    0.0131   0.0377   0.01549
demo-v20(2)                0.0206    0.222    15.81    0.0288    0.197   0.06499
demo-v20(3)                0.0355    0.929    479.8    0.0715    0.729   0.04946
demo-v20(4)                0.0634     3.52      t/o     0.168     2.14   0.08163
demo-v21(2)               0.00909   0.0216  0.09432    0.0148   0.0216     0.584
demo-v21(3)                0.0283    0.136    2.672    0.0714    0.156     27.07
demo-v22(1)               0.00896   0.0302    0.104    0.0244    0.116    0.0816
demo-v22(2)                0.0107   0.0504   0.2223    0.0371    0.138    0.1752
demo-v22(3)                0.0127   0.0683    1.801    0.0458    0.341    0.2638
demo-v22(4)                0.0138   0.0817    13.89    0.0519    0.522    0.4677
demo-v22(5)                0.0153    0.154    4.138    0.0547    0.608    0.7734

mean computation times, except for very small benchmarks. The smart randomized approach also has 
a low standard deviation in the computation times, except for the benchmarks with a high fluctuation in 
the strategy densities. For example, for demo-v20 through demo-v22, the standard deviation is about 
20 percent of the mean computation time for the larger bounds. 



6 Conclusion 

We performed an experimental evaluation of currently available methods to obtain sparse winning 
positional strategies from safety games, and compared positional strategy finding against a recent 
computational learning approach for non-positional strategies. 

The evaluation shows that for the explicitly represented games that stem from synthesis problems, 
exact methods, such as applying the OPTSAT tool or using an ILP solver, are competitive in terms of 
computation time, although for the larger benchmarks, heuristic methods may be the only sensible way 

Table 5: Computation time comparison for the Bounded Synthesis benchmarks, with modification to 
allow for sparser strategies. All times are given in seconds. 

Benchmark                  Random      ILP   OPTSAT    Random    RepLP
                           (dumb)                     (smart)
BasicMutex'(3)                011 9     "'01               0177   ()5'^8
BasicMutex'(4)                 0^7^     1 8i     1118       107      224
BasicMutex'(5)                 0728     20 1    1 535        66     2.06
ThreePhilosophers'(3)      0.0109   0.0628              0.018   0.0426
ThreePhilosophers'(4)        0525      t/o      1 1       329     1 09
ThreePhilosophers'(5)         197      t/o    5 147       4 6     14 4
demo-v8'(3)                  0143    049 1               0241     0456
demo-v8'(4)                0.0333    0.365   0.1345     0.161    0.193
demo-v8'(5)                 0.055      1.6   0.5534     0.867     1.51
demo-v9'(1)                0.0218       12   0.7623    0.0675    0.466
demo-v9'(2)                 0.477      t/o    130.8      15.2      248
demo-v13'(3)               0.0246   0.0773  0.05977     0.136   0.0856
demo-v13'(4)               0.0666    0.745   0.1788     0.973    0.814
demo-v13'(5)                0.183     2.69   0.9863      2.94     3.32
demo-v15'(1)               0.0077   0.0742  0.08062    0.0101   0.0246
demo-v15'(2)               0.0235     44.1    5.561    0.0674    0.477
demo-v19'(1)                0.131      293    39.27      2.59     13.5
demo-v23'(1)                0.335      t/o    33.45      14.1     16.9


Table 6: Standard deviations of the strategy densities/sizes for a selection of benchmarks 

Benchmark       Random (dumb)   ILP  OPTSAT  Random (smart)   RepLP  Learning
demo-v16(2)             10.59                         11.54  0.7483
demo-v16(3)             39.88                         16.98   2.608
demo-v16(4)             47.57           t/o           40.67   4.147
demo-v16(5)             94.58   t/o     t/o           48.64   2.638
demo-v17(2)             11.42                         9.074
demo-v17(3)              31.5                         16.21   2.245
demo-v17(4)             50.51           t/o           38.12  0.4899
demo-v17(5)             63.23           t/o           42.29   1.939

to go. For the heuristic methods, the smarter version of the randomized method is surprisingly good and 
comparable to the repetitive linear programming approach in terms of quality of the results. A possible 
reason for this good performance is that both methods always find local optima. 

The learning approach to obtain non-positional strategies has shown its potential. While for most 
benchmarks, the strategies found by the approach were much larger than the densities of the positional 
ones, for others, a non-positional strategy representation that is much smaller than the density of the 
sparsest positional strategy was found. 

For this paper, we have deliberately taken a relatively simple game model: explicit safety games. 
The results at hand nevertheless have implications for more complex game types, such as symbolically 
represented safety or parity games: the good performance of heuristics that find local minima shows 
that the idea, although simple, has some potential, and gives rise to the question of how this idea can be 
transferred to the world of symbolically represented games. 